Analyzing In-Memory Hash Joins: Granularity Matters

Authors

  • Jian Fang
  • Jinho Lee
  • Peter Hofstee
  • Jan Hidders

Abstract

Predicting the performance of join algorithms on modern hardware is challenging. In this work, we focus on main-memory no-partitioning and partitioning hash join algorithms executing on multi-core platforms. We discuss the main parameters that affect performance and present an effective performance model. This model can be used to select the most appropriate algorithm for different input datasets on current and future hardware configurations. We find that, on modern systems, an optimized no-partitioning hash join often outperforms an optimized radix-partitioning hash join.
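
To make the two algorithm families named in the abstract concrete, here is a minimal single-threaded sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the (key, payload) tuple layout, integer join keys, and the single-pass partitioning on the low key bits are all hypothetical choices made for this example, and the multi-core aspects the paper studies are left out.

```python
from collections import defaultdict


def no_partitioning_join(build, probe):
    """Join two lists of (key, payload) tuples through one shared hash table."""
    table = defaultdict(list)
    for key, payload in build:            # build phase: hash the build relation
        table[key].append(payload)
    out = []
    for key, payload in probe:            # probe phase: look up every probe tuple
        for match in table.get(key, ()):  # .get avoids inserting missing keys
            out.append((key, match, payload))
    return out


def radix_partition(rel, bits):
    """Split a relation into 2**bits partitions on the low bits of the key.

    Assumes integer keys (an assumption of this sketch).
    """
    mask = (1 << bits) - 1
    parts = [[] for _ in range(1 << bits)]
    for key, payload in rel:
        parts[key & mask].append((key, payload))
    return parts


def radix_join(build, probe, bits=4):
    """Partition both inputs, then join matching partitions pairwise."""
    out = []
    build_parts = radix_partition(build, bits)
    probe_parts = radix_partition(probe, bits)
    for b_part, p_part in zip(build_parts, probe_parts):
        # Each per-partition table is small, which is what is meant to keep
        # it cache-resident on real hardware.
        out.extend(no_partitioning_join(b_part, p_part))
    return out


R = [(1, "a"), (2, "b"), (17, "c")]
S = [(1, "x"), (17, "y"), (17, "z"), (3, "w")]
# Both strategies produce the same matches, possibly in a different order.
assert sorted(no_partitioning_join(R, S)) == sorted(radix_join(R, S))
```

The two functions return the same result set; they differ in how memory is accessed. The no-partitioning variant probes one large table, while the radix variant pays for an extra partitioning pass so that each build table is small.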

Similar resources

Cache-Oblivious Hash Joins

Partitioning has been used to improve the performance of the hash join in the main memory; however, cache-conscious partitioning requires the knowledge about the cache parameters, such as the capacity and unit size, of a chosen level of the CPU caches, e.g., the L2 cache. Obtaining this knowledge and subsequently tuning the algorithm may be inconvenient, and sometimes infeasible, for complex sy...

Memory-Efficient Hash Joins

We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded populatio...

An Adaptive Hash Join Algorithm for Multiuser Environments

As main memory becomes a cheaper resource, hash joins are an alternative to the traditional methods of performing equi-joins: nested loop and merge joins. This paper introduces a modified, adaptive hash join method that is designed to work with dynamic changes in the amount of available memory. The general idea of the algorithm is to regulate resource usage of a hash join in a way that allows i...

Memory-Contention Responsive Hash Joins

In order to maximize system performance in environments with fluctuating memory contention, memory-intensive algorithms such as hash join must gracefully adapt to variations in available memory. Mixed workloads, creating fluctuations of erratic frequency and magnitude, make responsiveness to memory contention particularly important. Previous studies on adaptable hash joins have focused on lower...

On a Three-Way Hash Join Algorithm

We develop hash-based algorithms for computing a three-way join. The method involves hashing all three relations into buckets, and then joining buckets in main memory, three buckets at a time. Compared with two cascaded hash joins, the algorithms avoid materializing an intermediate result. We present a cost model for this approach, from which we identify the range of parameters for queries that ...

Publication date: 2017